Identification of Idioms by Machine Translation: a Hybrid Research System vs. Three Commercial Systems

Author

  • Dimitra Anastasiou
Abstract

We compare three commercial Machine Translation (MT) systems (Power Translator Pro, SYSTRAN, and T1 Langenscheidt) with METIS-II, a hybrid research system that combines statistical and rule-based methods, with respect to the identification of idioms. First, we distinguish between continuous idioms (adjacent constituents) and discontinuous idioms (non-adjacent constituents). Second, we describe our idiom resources within METIS-II and the system’s identification process, and we evaluate the results with simple techniques. From the translation outputs of the commercial systems we deduce that they cannot identify discontinuous idioms. We show that, within METIS-II, the identification of discontinuous idioms is feasible, even with limited resources.
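The distinction between continuous and discontinuous idioms can be illustrated with a minimal sketch. This is not the METIS-II identification process, which the abstract does not detail; the function names `find_idiom` and `_match_from`, the `max_gap` parameter, and the example idiom are all hypothetical, and matching is done on exact tokens, whereas a real system would more likely work on lemmas and syntactic information.

```python
from typing import List, Optional


def _match_from(tokens: List[str], idiom: List[str],
                pos: int, k: int, max_gap: int) -> Optional[List[int]]:
    """tokens[pos] already matches idiom[k]; try to match the remaining constituents."""
    if k == len(idiom) - 1:
        return [pos]
    # The next constituent may be separated by at most max_gap intervening tokens.
    last = min(len(tokens), pos + 2 + max_gap)
    for nxt in range(pos + 1, last):
        if tokens[nxt] == idiom[k + 1]:
            rest = _match_from(tokens, idiom, nxt, k + 1, max_gap)
            if rest is not None:
                return [pos] + rest
    return None


def find_idiom(tokens: List[str], idiom: List[str],
               max_gap: int = 0) -> Optional[List[int]]:
    """Return the token positions of the idiom constituents, or None.

    max_gap=0 only accepts continuous idioms (adjacent constituents);
    max_gap>0 also accepts discontinuous ones (hypothetical parameter).
    """
    if not idiom:
        return None
    for i, tok in enumerate(tokens):
        if tok == idiom[0]:
            match = _match_from(tokens, idiom, i, 0, max_gap)
            if match is not None:
                return match
    return None


# Illustrative fixed expression, not taken from the paper's idiom resources.
IDIOM = ["take", "into", "account"]
continuous = "we take into account the cost".split()
discontinuous = "we take the cost into account".split()

print(find_idiom(continuous, IDIOM))                # adjacent constituents
print(find_idiom(discontinuous, IDIOM))             # fails with max_gap=0
print(find_idiom(discontinuous, IDIOM, max_gap=3))  # gap-tolerant match
```

A matcher restricted to adjacent tokens finds only the first sentence; allowing a bounded gap between constituents is one simple way to recover the second, at the cost of possible false positives when the gap is set too generously.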


Similar resources

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...


The Automatic Translation of Idioms. Machine Translation vs. Translation Memory Systems

Translating idioms is one of the most difficult tasks for human translators and translation machines alike. The main problems consist in recognizing an idiom and in distinguishing idiomatic from non-idiomatic usage. Recognition is difficult since many idioms can be modified and others can be discontinuously spread over a clause. But with the help of systematic idiom collections and special rule...


Translation of verbal idioms

Verbal idioms constitute a challenge for machine translation systems: their meaning is not compositional, preventing a word-for-word translation, and they can be discontinuous, preventing a match during tokenization. This paper presents the treatment of verbal idioms in our machine translation system, which addresses both challenges by deferring idiom matching until after the parse, and by allow...


Strategies Employed in Translation of Idioms in English Subtitles of Two Persian Television Series

Translation of idioms seems to be complicated for most translators since the meaning of idioms is difficult and sometimes impossible to be deduced from the meaning of their individual components. Considering the difficulties of translation of idioms and also the specific constraints of subtitling such as space and time limits, this research studied the strategies employed in translation of idio...


Translation Of Telugu-Marathi and Vice-Versa using Rule Based Machine Translation

In today’s digital world, automated Machine Translation from one language to another has come a long way and achieved a variety of success stories. Whereas Babel Fish supports a good number of foreign languages but only Hindi among Indian languages, Google Translator covers about 10 Indian languages. Though most of the Automated Machine Translation Systems are doing well but handl...




Publication date: 2008